Pattern Directed Mining of Sequence Data

Authors

  • Valery Guralnik
  • Duminda Wijesekera
  • Jaideep Srivastava
Abstract

Sequence data arise naturally in many applications, and can be viewed as an ordering of events, where each event has an associated time of occurrence. An important characteristic of event sequences is the occurrence of episodes, i.e. a collection of events occurring in a certain pattern. Of special interest are frequent episodes, i.e. episodes occurring with a frequency above a certain threshold. In this paper, we study the problem of mining for frequent episodes in sequence data. We present a framework for efficient mining of frequent episodes which goes beyond previous work in a number of ways. First, we present a language for specifying episodes of interest. Second, we describe a novel data structure, called the sequential pattern tree (SP Tree), which captures the relationships specified in the pattern language in a very compact manner. Third, we show how this data structure can be used by a standard bottom-up mining algorithm to generate frequent episodes in an efficient manner. Finally, we show how the SP Tree can be optimized by sharing common conditions, and evaluating each such expression only once. We present the results of an evaluation of the proposed techniques.
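To make the notion of a bottom-up search for frequent episodes concrete, the following is a minimal Python sketch. It is not the paper's method: it does not implement the pattern language or the SP Tree, and the abstract does not say how frequency is measured. The sketch assumes serial episodes (ordered tuples of event types), integer timestamps, a non-empty input, and a sliding-window frequency (the fraction of fixed-width windows in which the episode occurs as a subsequence). All function names and parameters are hypothetical.

```python
def windows(events, width):
    """Yield the event types inside each sliding window of `width` time units.

    Assumption: `events` is a non-empty list of (time, event_type) pairs,
    sorted by time, with integer times. Windows are half-open intervals
    [start, start + width) for every start that overlaps the sequence.
    """
    t_min, t_max = events[0][0], events[-1][0]
    for start in range(t_min - width + 1, t_max + 1):
        yield [e for t, e in events if start <= t < start + width]


def occurs(episode, window):
    """Return True if `episode` is an ordered subsequence of `window`."""
    it = iter(window)
    # Each `any` consumes the iterator up to the first match, so order is kept.
    return all(any(e == w for w in it) for e in episode)


def frequency(episode, events, width):
    """Fraction of sliding windows in which `episode` occurs."""
    wins = list(windows(events, width))
    return sum(occurs(episode, w) for w in wins) / len(wins)


def mine_serial_episodes(events, width, min_freq, max_len=3):
    """Level-wise (bottom-up) search for frequent serial episodes.

    Candidates of length k+1 are built by appending one event type to a
    frequent episode of length k, and are kept only if their length-k
    suffix is also frequent (Apriori-style pruning).
    """
    types = sorted({e for _, e in events})
    frequent = {}
    level = [(t,) for t in types]
    for _ in range(max_len):
        kept = {}
        for ep in level:
            f = frequency(ep, events, width)
            if f >= min_freq:
                kept[ep] = f
        if not kept:
            break
        frequent.update(kept)
        level = [ep + (t,) for ep in kept for t in types
                 if ep[1:] + (t,) in kept]
    return frequent


# Hypothetical toy sequence: (time, event_type) pairs.
events = [(1, "A"), (2, "B"), (3, "A"), (4, "B"), (5, "C")]
result = mine_serial_episodes(events, width=2, min_freq=0.3)
for episode, freq in sorted(result.items()):
    print(episode, round(freq, 3))
```

The pruning step relies on the fact that, under the window-based frequency assumed here, an episode can never be more frequent than any of its sub-episodes, so a candidate with an infrequent suffix can be discarded without counting it.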

Related resources

High Fuzzy Utility Based Frequent Patterns Mining Approach for Mobile Web Services Sequences

High fuzzy utility based pattern mining is an emerging topic in data mining. It refers to discovering all patterns whose utility meets a user-specified minimum high utility threshold. It comprises extracting patterns which are highly accessed in mobile web service sequences. Different from the traditional fuzzy approach, high fuzzy utility mining considers not only counts of mob...

A Proposition for Sequence Mining Using Pattern Structures

In this article we present a novel approach to rare sequence mining using pattern structures. Particularly, we are interested in mining closed sequences, a type of maximal sub-element which allows providing a succinct description of the patterns in a sequence database. We present and describe a sequence pattern structure model in which rare closed subsequences can be easily encoded. We also pro...

Survey on Sequence Discovery Using Dna Sequence Mining Data

Sequence mining is one of the most commonly used techniques in data mining. Sequence mining is the process of mining frequent patterns from large datasets. The existing algorithms have some limitations in predicting frequent patterns, in terms of time, space complexity and accuracy. To overcome these drawbacks, this paper studies existing sequence mining algorithms and generate a new...

Establishing relationships among patterns in stock market data

Similarities among subsequences are typically regarded as categorical features of sequential data. We introduce an algorithm for capturing the relationships among similar, contiguous subsequences. Two time series are considered to be similar during a time interval if every contiguous subsequence of a predefined length satisfies the given similarity criterion. Our algorithm identifies patterns b...

Behaviour Recovery and Complicated Pattern Definition in Web Usage Mining

Data mining includes four steps: data preparation, pattern mining, pattern analysis, and pattern application. But in the web environment, user activities become much more complex because of the complex web structure. So user behaviour recovery and pattern definition play more important roles in web mining than in other applications. In this paper, we gave a new view on behaviour recovery and c...

Publication date: 1998